MDEV-40063 Corruption due to race in SET GLOBAL innodb_log_archive=ON by dr-m · Pull Request #5253 · MariaDB/server

dr-m · 2026-06-18T05:38:12Z

There was a race condition between log_t::write_checkpoint() and the execution of SET GLOBAL innodb_log_archive=ON (enabling log archiving). We had wrongly allowed the concurrent execution of log_t::set_archive() and log_t::write_checkpoint(). The result was that log_sys.next_checkpoint_no was corrupted. This could have broken crash recovery.

log_t::write_checkpoint(): When we are releasing log_sys.latch while durably writing the checkpoint header block, assign log_sys.resize_log to log_sys.log to inform other threads that a checkpoint is in progress. Previously, we only did this when innodb_log_archive=ON holds.

log_t::resize_start(): Relax a debug assertion for the logic change.

This addresses this problem found in #4817 and originally introduced in #4405.

CLAassistant · 2026-06-18T05:38:19Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

gemini-code-assist

Code Review

This pull request refactors the checkpoint writing logic in InnoDB (buf0flu.cc and log0log.cc) to block concurrent set_archive() calls by setting resize_log = log earlier and handling its cleanup. The review feedback suggests refactoring the control flow to avoid using a goto statement to jump into an if block, which would improve code readability and maintainability.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-18T05:39:46Z

  if (!archive)
+  {
    archived_lsn= end_lsn;
+  checkpoint_completed:
+    if (resize_log.m_file == log.m_file)
+    {
+      /* We may have assigned resize_log= log to keep set_archived() out. */
+#ifdef HAVE_PMEM
+      ut_ad(!is_mmap());
+#endif
+      resize_log.m_file= OS_FILE_CLOSED;
+    }
+  }


Jumping into an if block using goto is a highly discouraged control flow pattern in C++ that hurts readability and maintainability. We can completely avoid the goto statement and the checkpoint_completed label by restructuring the code. Since resize_log.m_file != log.m_file is guaranteed when archive_header_was_reset is true, we can safely place the resize_log.m_file cleanup block directly after the if-else chain. This simplifies the logic and makes it much cleaner.

if (!archive) { archived_lsn= end_lsn; }

gemini-code-assist · 2026-06-18T05:39:47Z

+  else
+    goto checkpoint_completed;


By placing the cleanup block here after the if-else chain, we can completely eliminate the goto checkpoint_completed; statement and the corresponding label inside the if (!archive) block. This makes the control flow linear and much easier to follow.

if (resize_log.m_file == log.m_file) { /* We may have assigned resize_log= log to keep set_archived() out. */ #ifdef HAVE_PMEM ut_ad(!is_mmap()); #endif resize_log.m_file= OS_FILE_CLOSED; }

There was a race condition between log_t::write_checkpoint() and the execution of SET GLOBAL innodb_log_archive=ON (enabling log archiving). We had wrongly allowed the concurrent execution of log_t::set_archive() and log_t::write_checkpoint(). The result was that log_sys.next_checkpoint_no was corrupted. This could have broken crash recovery. log_t::write_checkpoint(): When we are releasing log_sys.latch while durably writing the checkpoint header block, assign log_sys.resize_log to log_sys.log to inform other threads that a checkpoint is in progress. Previously, we only did this when innodb_log_archive=ON holds. log_t::resize_start(): Relax a debug assertion for the logic change. Tested by: Matthias Leich

dr-m requested a review from Thirunarayanan June 18, 2026 05:38

dr-m self-assigned this Jun 18, 2026

dr-m added the MariaDB Corporation label Jun 18, 2026

gemini-code-assist Bot reviewed Jun 18, 2026

View reviewed changes

dr-m force-pushed the MDEV-40063 branch from 01095c8 to eb0f328 Compare June 18, 2026 06:21

Thirunarayanan approved these changes Jun 18, 2026

View reviewed changes

dr-m merged commit eb0f328 into 13.0 Jun 18, 2026
16 of 19 checks passed

dr-m deleted the MDEV-40063 branch June 18, 2026 07:25

dr-m mentioned this pull request Jun 18, 2026

MDEV-14992 BACKUP SERVER to mounted file system #4817

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

MDEV-40063 Corruption due to race in SET GLOBAL innodb_log_archive=ON#5253

MDEV-40063 Corruption due to race in SET GLOBAL innodb_log_archive=ON#5253
dr-m merged 1 commit into
13.0from
MDEV-40063

dr-m commented Jun 18, 2026

Uh oh!

CLAassistant commented Jun 18, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Jun 18, 2026

Uh oh!

gemini-code-assist Bot Jun 18, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

dr-m commented Jun 18, 2026

Uh oh!

CLAassistant commented Jun 18, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

3 participants